Abstract

The total duration of drawdowns is shown to provide a moment-free, unbiased, efficient and robust estimator of Sharpe ratios both for Gaussian and heavy-tailed price returns. We then use this quantity to infer an analytic expression of the bias of moment-based Sharpe ratio estimators as a function of the return distribution tail exponent. The heterogeneity of tail exponents at any given time among assets implies that our new method yields significantly different asset rankings than those of moment-based methods, especially in periods of large volatility. This is fully confirmed by using 20 years of historical data on 3449 liquid US equities.

1. Introduction


Sharpe ratios (Sharpe 1964) appear naturally in financial analysis for a good reason: they are nothing else than signal-to-noise ratios, a fundamental quantity in signal analysis. In a Gaussian world, they are also equivalent to t-statistics. Finance is not an ideal world, however, and many problems arise in practice. The distribution of the Sharpe ratio (Lo 2002), its bias (Miller and Gehr 1978; Jobson and Korkie 1981) and corrections due to serial correlations (Lo 2002; Mertens 2002; Christie 2005; Opdyke 2007) have been characterized. Better estimation methods use the Generalized Method of Moments (Lo 2002; Christie 2005) and block bootstraps (Ledoit and Wolf 2008). Although Sharpe ratios only depend on the first and second moments of price returns, their variance depends on the third and fourth moments (Lo 2002; Mertens 2002; Christie 2005; Opdyke 2007). Given the definition of Sharpe ratios, it is not surprising that all these methods rely on the computation of moments of price returns. But as noted e.g. in Opdyke (2007), this may be problematic as the fourth moment may not be defined (Dacorogna et al. 2001; Jondeau and Rockinger 2003). Finally, the standard estimator of the Sharpe ratio is known to be biased for heavy-tailed price returns. Once again, the corrections proposed, e.g. for the Deflated Sharpe Ratio, depend on the third and fourth moments (Bailey and Lopez de Prado 2014).


Here, I propose a new way to estimate Sharpe ratios that does not require the computation of any moment and that may be extended to estimate the drift of time series with infinite variance. It is based on the fact that the total duration of all drawdowns in a price time series of a given length is a monotonic function of the Sharpe ratio; by symmetry, the same holds for the total duration of all drawups. As a consequence, one may estimate Sharpe ratios by computing the difference between the total durations of drawups and drawdowns. This quantity is bounded by definition and leads to an estimator that is both robust to outliers and more efficient than direct estimates of Sharpe ratios for heavy-tailed data.

Figure 1. Example log price time series (black lines), its running maximum (blue dashed lines), and running minimum (green dashed lines). The number of upper (lower) records $R_+$ ($R_-$) is equal to the number of jumps of the running maximum (minimum) plus one since the first point counts as a record by convention: here $R_+ = 4$ and $R_- = 7$. The total drawdown duration is $T_- = 16$ and the total drawup duration is $T_+ = 13$. Clearly, $R_+ + T_- = R_- + T_+ = 20 + 1$, the number of returns plus one.

Above all, the new estimator is unbiased for heavy-tailed data, in contrast to the standard estimation method. Moreover, we propose that the Sharpe ratio depends in a simple way on the total drawdown duration and on the tail exponent of the return distribution, which allows a direct estimation of the bias of the moment-based estimator at fixed total drawdown duration. This gives a new take on asset ranking: because asset price return distributions have different tail exponents at any point in time, the new method yields considerably different asset rankings, especially at the top and bottom quantiles, and during the most volatile periods. Thus this paper contributes both theoretically and empirically to the on-going debate about the relevance of the ranking method (Eling and Schuhmacher 2007; Zakamulin 2010; Schuhmacher and Eling 2011; Ornelas, Silva Junior, and Fernandes 2012; Auer and Schuhmacher 2013a,b).


Intuitively, the sum of all drawdown durations, i.e., the total drawdown duration of a time series of fixed length, is linked to the number of upper price records since a new price return pushes the price either to an all-time high (a new upper record) or to a drawdown (see Fig. 1). This implies that if $n$ is the length of a price time series and $R_+$ is the number of its upper records ($R_+ \geq 1$ because the first point is a record by convention), the total drawdown duration, denoted by $T_-$, is $T_- = n - (R_+ - 1)$. Because of this equivalence, total drawdown/up durations and the numbers of price records lead to two equivalent estimators; accordingly, we will use either wording. Assuming that log prices are simple random walks, drawdown/up durations are determined by first-passage times, themselves derived from persistence (or survival) properties (Redner 2001). The connection between persistence and price dynamics, especially in the context of market microstructure, is well known (Lo, MacKinlay, and Zhang 2002; Eisler et al. 2009).
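The link between record counts and drawdown duration can be checked on a toy path. The Python sketch below is only illustrative: the helper names are hypothetical, and it assumes that $n$ counts returns, so that the path has $n + 1$ points including the starting one, and that every point which does not set a new maximum belongs to a drawdown.

```python
from itertools import accumulate


def log_prices_from_returns(returns):
    """Cumulate log returns into a log price path starting at 0 (n + 1 points)."""
    return [0.0] + list(accumulate(returns))


def count_upper_records(path):
    """Number of upper records; the first point counts as a record by convention."""
    records = 0
    running_max = float("-inf")
    for s in path:
        if s > running_max:  # strict inequality: a tie is not a new record
            records += 1
            running_max = s
    return records


def total_drawdown_duration(path):
    """Number of points that do not set a new running maximum."""
    duration = 0
    running_max = float("-inf")
    for s in path:
        if s > running_max:
            running_max = s
        else:
            duration += 1
    return duration
```

Lower records and the total drawup duration follow by applying the same functions to the negated path, e.g. `count_upper_records([-s for s in path])`.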


Persistence is at the core of a noteworthy recent result about discrete-time unbiased random walks. In a financial context, it may be stated as follows: the distribution of the number of upper (or lower) records of a price time series of fixed length with independent and identically distributed (i.i.d.) returns does not depend on the increment distribution provided that the latter is symmetric and continuous (Majumdar and Ziff 2008). This universality is behind the robustness and power of the r-statistics, a family of statistics based on the number of records of a time series, which not only provides a powerful non-parametric location test (Challet 2015) but also, as shown here, an efficient estimator of Sharpe ratios. Their robustness comes from the fact that the influence of outliers is much dampened because sample values are transformed into an integer number with bounded admissible values.


Majumdar, Schehr, and Wergen (2012) show that the distribution of the number of records converges to a Gaussian distribution in the limit of infinitely long time series provided that the price return distribution has a finite variance. Even better, the support of the finite-size sample distribution of the new estimator is bounded, contrary to that of Sharpe ratios (and t-statistics), and is accordingly more peaked than a normal distribution (Challet 2015). When the true Sharpe ratio is different from zero, the expected number of records and its variance are distribution-dependent; exact expressions are only known for exponentially distributed increments, hence one has to resort to approximations and numerical simulations for other types of distribution in the limits of large and small Sharpe ratios.

Drawdown durations are by definition integer numbers, which is not optimal to estimate a real number. The solution comes from random permutations. Assuming that the price returns are i.i.d., one can shuffle their order at will and compute the resulting price time series, which is an equally valid representation of a given set of price returns and most likely has a different number of upper and lower records. Thus, to obtain a more precise estimate of the Sharpe ratio, one takes the average of the difference between the total drawdown and drawup durations over many such permutations (see Fig. 3 for a graphical explanation).

The structure of this paper is as follows: Section 2 introduces the necessary notations to define price record statistics and shows that when prices have a positive trend, heavy-tailed increments lead to a larger number of upper price records than Gaussian increments; a mathematical derivation of the expected number of price records for Student's t-distributed increments is reported in Appendix A, which focuses on the case of tail exponent equal to 4 (3 degrees of freedom) for the sake of analytical tractability. Section 3 investigates the efficiency of the number of price records as Sharpe ratio estimators relative to the vanilla estimator and shows that the new estimator is several times more efficient than moment-based methods for heavy-tailed variables and almost as efficient as the vanilla estimator in the case of Gaussian variables; it then derives a simple equation that simplifies the calibration of the relationship between true Sharpe ratio and estimated tail exponents and number of records. Section 4 uses an unbiased historical data set of 3449 liquid US equities to estimate 100-day rolling Sharpe ratios with both methods. It turns out that in leptokurtic times, the estimates from both methods may differ very significantly because the vanilla Sharpe ratio estimator is not only more volatile, but also systematically overestimates the information content of price time series that have heavy-tailed returns.


2. Record statistics of random prices 

Financial data exist in discrete time, which will be the point of view adopted in this paper. Let us assume that the initial log price is $S_0 = 0$ and that its value at time $k > 0$ follows

$$S_k = S_{k-1} + r_k + c, \qquad (1)$$

where $r_k$ is the increment at time $k$, assumed to be identically and independently drawn from a continuous distribution $P(r)$, and $c$ is a constant trend. Let the running maximum be $M_k = \max_{1 \leq t \leq k} S_t$ (see Fig. 1). The number of upper records of a time series of length $n$ is the number of jumps of $M_n$, which by convention always includes $M_1$; it will be denoted by $R_+$ and its distribution by $P(R_+, n)$. In the same spirit, one defines $R_-$, the number of lower records, as the number of jumps of the running minimum.


Majumdar and Ziff (2008); Le Doussal and Wiese (2009); Majumdar, Schehr, and Wergen (2012) demonstrate that many quantities of interest are fully characterized by the persistence function $q_-(n)$ of the process, i.e., the probability that the price has never exceeded its starting value after $n$ steps. It is advantageous to work with its characteristic function $q_-(z) = \sum_{n \geq 0} q_-(n)\, z^n$. For example, the characteristic function of $P(R_+, n)$ is (Majumdar and Ziff 2008)

$$P(R_+, z) = q_-(z)\,\left[1 - (1 - z)\, q_-(z)\right]^{R_+ - 1},$$

while the characteristic function of the expected number of upper records $m_+(n) = E(R_+)(n)$ can be written as $m_+(z) = \left[(1 - z)^2\, q_-(z)\right]^{-1}$ (Le Doussal and Wiese 2009).


The generalized Sparre Andersen theorem (Andersen 1953; Feller 2008) provides a constructive way to compute $q_-(z)$ for any continuous and symmetric $P(r)$,

$$\log q_-(z) = \sum_{n=1}^{\infty} \frac{z^n}{n}\, P(S_n < 0). \qquad (2)$$


2.1 Driftless prices 

A direct consequence of this theorem is the universality of the unbiased case $c = 0$: since $P(S_k < 0) = 1/2$ for all symmetric and continuous distributions, $q_\pm(z) = q(z)$ is the same for all such distributions and

$$P(R, n) = \binom{2n - R + 1}{n}\, 2^{-2n + R - 1},$$

where $R$ may either be $R_+$ or $R_-$, by symmetry (Majumdar and Ziff 2008). This implies that the first two moments of this distribution are

$$E(R_\pm)(n) \simeq 2\sqrt{\frac{n}{\pi}} \quad \text{and} \quad E\left[(R_\pm - E(R_\pm))^2\right](n) \simeq \left(2 - \frac{4}{\pi}\right) n.$$
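As a numerical check of the driftless case, the following Python sketch evaluates the universal distribution $P(R, n) = \binom{2n - R + 1}{n} 2^{-2n + R - 1}$ and compares its exact moments with the asymptotic expressions $2\sqrt{n/\pi}$ and $(2 - 4/\pi)n$. It assumes that $n$ counts steps and that the starting point is the first record, so that $R$ ranges from 1 to $n + 1$; the function names are hypothetical.

```python
from math import comb, pi, sqrt


def record_pmf(r, n):
    """P(R = r) after n steps of a driftless, symmetric, continuous random walk."""
    if r < 1 or r > n + 1:
        return 0.0
    # Integer arithmetic avoids the underflow of 2**(-2n + r - 1) for large n.
    return comb(2 * n - r + 1, n) / 2 ** (2 * n - r + 1)


def record_moments(n):
    """Exact mean and variance of the number of records after n steps."""
    probs = [(r, record_pmf(r, n)) for r in range(1, n + 2)]
    mean = sum(r * p for r, p in probs)
    var = sum((r - mean) ** 2 * p for r, p in probs)
    return mean, var


def asymptotic_moments(n):
    """Leading-order mean and variance for large n."""
    return 2.0 * sqrt(n / pi), (2.0 - 4.0 / pi) * n
```

For $n = 1000$ the exact mean agrees with the asymptotic one to better than 1%, while the variance converges more slowly (a few percent difference at this length).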


2.2 Limit of small relative drift 


Analytical results are harder to obtain in the case of non-zero drift ($c \neq 0$) since the Sparre Andersen theorem requires the full knowledge of all convolutions of the elementary increments. Denoting the standard deviation of the increments by $\sigma$, good approximations of the expected number of upper records are known for Gaussian increments in the limit of small relative drift, i.e., when $c/\sigma \ll 1$ and $n \gg 1$ while $cn/\sigma \ll 1$ (Wergen, Bogner, and Krug 2011; Majumdar, Schehr, and Wergen 2012):

$$E(R_+)(c/\sigma, n) \simeq 2\sqrt{\frac{n}{\pi}} + \frac{c}{\sigma}\,\frac{\sqrt{2}}{\pi}\left[n \arctan(\sqrt{n}) - \sqrt{n}\right]. \qquad (3)$$
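To get a feeling for the size of the drift correction, the small-drift approximation can be evaluated directly. The sketch below assumes that Eq. (3) reads $E(R_+) \simeq 2\sqrt{n/\pi} + (c/\sigma)(\sqrt{2}/\pi)[n \arctan(\sqrt{n}) - \sqrt{n}]$ for Gaussian increments; the function name is hypothetical.

```python
from math import atan, pi, sqrt


def expected_records_small_drift(n, sharpe):
    """Small-drift approximation of E(R_+) for Gaussian increments (sharpe = c/sigma)."""
    driftless = 2.0 * sqrt(n / pi)
    correction = sharpe * sqrt(2.0) / pi * (n * atan(sqrt(n)) - sqrt(n))
    return driftless + correction
```

For $n = 100$ and $c/\sigma = 0.005$, the driftless term is about 11.28 and the correction about 0.31. By symmetry, the expected number of lower records is obtained by changing the sign of the drift, so that the expected difference between upper and lower records is twice the correction term in this regime.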


The case of heavy-tailed increments with finite variance has not been thoroughly investigated. We will focus on Student's t-distributions because of their ability to reproduce both fat-tailed and Gaussian returns. They are known to describe the unconditional price return distribution (i.e., forgetting about volatility heteroskedasticity) (Bouchaud and Potters 2000; Longin 2005; Opdyke 2007) and innovations (see e.g. Bollerslev (1987)). Let us therefore assume from now on that the price returns $r_k$ are distributed according to a Student's t-distribution of variance $\sigma^2$ with $\nu$ degrees of freedom (we use this wording only to parametrize the return distribution), denoted by $P(r)$. The Sparre Andersen theorem requires the knowledge of the $n$-time convoluted return distribution, denoted by $P^{(n)}(r)$, of which no explicit expression exists for generic values of $n$ and $\nu$. In passing, $P^{(n)}(r)$ can be explicitly computed for any value of $n$ provided that $\nu$ is odd but the expressions quickly become cumbersome as $n$ grows (Nadarajah and Dey 2005). This is why we shall resort to approximations.

Appendix A reports approximate analytical results for the case $\nu = 3$, i.e., for a tail exponent of 4.^1 The resulting expected number of upper records becomes, in the same limit $c/\sigma \ll 1$ and $n \gg 1$ while $cn/\sigma \ll 1$,

E{R^){n,c/a) ~ —^ [narctan(-v/n) — V^] 


^/tt aiT 


c 8 _ / , 1 LI 

H- =—aVn atanh-t /1- \ 1 - 


( 4 ) 


Although a first-order expansion, Eq. (4) is not very accurate even in the limit of small $n(c/\sigma)$, because the approximations needed to obtain explicit equations are quite rough (see Fig. 2). However, it was worth computing it for several reasons.


^1 This precise value is the only one for which analytical computations seem workable. It also happens to be in line with the average tail exponent of US equities daily and intraday price returns (… and De Vries 1991; Plerou et al. 19…; Bouchaud and Potters 2000; Longin 2005).
Figure 2. Excess number of records $E(R_+ \mid c/\sigma, n) - E(R_+ \mid 0, n)$ for biased random walks with Student-t increments ($\nu = 3$). Interrupted lines are theoretical predictions and continuous lines are from numerical simulations; $c = 0.001$, $\sigma = 1$, averages over 10^ samples.

First, it contains the correct dependence of $E(R_+)(n, c/\sigma)$ on $n$ for small Sharpe ratios, which means that one may use this functional form to fit numerical simulations. Second, the presence of the third term, due to the difference between Gaussian and t-distributions at the origin, correctly implies that prices with positive trends and heavy tails (and small Sharpe ratios) have a larger expected number of price records, which emphasises the importance of accounting for the tails of price return distributions when using price records to estimate Sharpe ratios (cf. Section 3).

Appendix B contains the derivation of $E(R_+)(n, c/\sigma)$ in the large relative drift limit, i.e., $c/\sigma \gg 1$ and $n \gg 1$. In this case, the expected number of records grows linearly.

It is noteworthy that these limits do not unequivocally correspond to small and large Sharpe ratios, since the limits do not involve $\sqrt{n}$ but $n$. The small effective drift limit can be rewritten as $(c/\sigma)\sqrt{n} \ll 1/\sqrt{n}$, which corresponds to vanishingly small Sharpe ratios, of little relevance to finance; there is no guarantee that the very large $c/\sigma$ limit corresponds to realistic situations. Thus, depending on both $c/\sigma$ and $n$, one may be close to either limit, or in a no limit's land. As a consequence, the next section resorts to extensive numerical calibration.


3. Moment-free Sharpe ratio estimator 


As shown by Majumdar, Schehr, and Wergen (2012), the expected number of records is a monotonic function of the ratio $c/\sigma$, hence, of the Sharpe ratio. In other words, there is a one-to-one correspondence between the two quantities. This implies that it is possible to estimate Sharpe ratios from the number of upper or lower records.
Figure 3. Schematic explanation of the idea behind the permutation estimator of Sharpe ratios: one computes the difference between total drawup and drawdown durations, or equivalently, the number of jumps of the running maximum (dashed lines) and the number of jumps of the running minimum (dotted lines) of the cumulated sums of the sample values, averaged over many random permutations of the price returns. By convention, the first point counts as a first record for both the running maximum and minimum.


More precisely, we have

$$E(R_+) = F_+(c/\sigma), \qquad (5)$$

which implies that

$$\widehat{\left(\frac{c}{\sigma}\right)} = F_+^{-1}(R_+). \qquad (6)$$

Because of the lack of exact results, we shall use numerical simulations to calibrate $F$.

The main problem of a number of records is that it is an integer number by definition, which yields an estimator with unacceptable precision for short time series. The fundamental idea of the r-statistics (Challet 2015), in this context, consists in assuming that log returns are i.i.d. In that case, one may build many other log price paths based on random permutations of the original returns and thus measure the average number of records of the cumulated sums over many permutations (see Fig. 3) (this scheme may be extended to correlated time series). Mathematically, denoting the random permutation of index $i \in \{1, \dots, n\}$ by $\pi(i)$ and the ensemble of all permutations by $\Pi$, the average number of records is $\bar{R}_+ = \frac{1}{|\Pi|}\sum_{\pi \in \Pi} R_{+,\pi}$, where $R_{+,\pi}$ is the number of upper records of $S_{n,\pi} = \sum_{m=1}^{n} r_{\pi(m)}$. In practice, one restricts computations to a subset of $\Pi$ for the sake of computational tractability, which has little influence on the end result; in this study, we have used 1000 random permutations. The new Sharpe ratio estimator is then based on $R_0 = \bar{R}_+ - \bar{R}_-$. More precisely, the idea is to first calibrate the relationship $E(R_0) = F_0(c/\sigma, n)$ at fixed $c/\sigma$ for a given distribution of synthetic price returns. Denoting $\theta = c/\sigma$, one then inverts this relationship to obtain

$$\hat{\theta} = F_0^{-1}(R_0, n). \qquad (7)$$

Estimation in the rest of this paper is based on extensive numerical simulations to establish the relationships $E(R_0)$ as a function of parameters $n$, $\theta$, and the Student parameter $\nu$. We chose $\nu \in \{2.5, \dots, 10\}$ with increments of 0.1, $10 \leq n \leq 375$ by steps of 5 and $n = 504$; we take 31 values of $\theta \in [0.001, 1]$ growing according to a geometric series. For each triple $(n, \theta, \nu)$, we generate $N_\mathrm{avg}$ synthetic time series, estimate $R_0$ over 1000 random permutations for each time series and then average $R_0$ over the $N_\mathrm{avg}$ time series. Splines are then used to fit and invert the relationships of Eq. (7).

Figure 4. Efficiency of the record-based estimator $\theta_0$ relative to that of the vanilla estimator $\theta_S$, defined by the ratio of the variance of the new estimator and that of the usual one, as a function of the true Sharpe ratio $c/\sigma$ of the synthetic data. Averages over $N_\mathrm{avg}$ = 10® samples per point; record numbers have been averaged over 1000 permutations; left plot: Student-distributed increments with tail exponent set to 4; right plot: Gaussian increments.
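The permutation average and the calibrate-then-invert procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the calibration used in the paper: increments are Gaussian only, the grid and sample sizes are tiny, piecewise-linear interpolation replaces the splines, the starting level 0 is taken as the first record for both the running maximum and minimum, and all function names are hypothetical.

```python
import random
from bisect import bisect_left


def record_difference(returns):
    """R_+ - R_- of the cumulated sums; the starting point counts for both."""
    level = highest = lowest = 0.0
    upper = lower = 1
    for r in returns:
        level += r
        if level > highest:
            highest, upper = level, upper + 1
        elif level < lowest:
            lowest, lower = level, lower + 1
    return upper - lower


def mean_record_difference(returns, n_perm, rng):
    """Average of R_+ - R_- over n_perm random permutations of the returns."""
    shuffled = list(returns)
    total = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        total += record_difference(shuffled)
    return total / n_perm


def calibrate(thetas, n, n_series, n_perm, seed=0):
    """Monte Carlo estimate of E(R_0) for each true Sharpe ratio in thetas."""
    rng = random.Random(seed)
    curve = []
    for theta in thetas:
        total = 0.0
        for _ in range(n_series):
            returns = [rng.gauss(theta, 1.0) for _ in range(n)]
            total += mean_record_difference(returns, n_perm, rng)
        curve.append(total / n_series)
    return curve


def invert_calibration(r0, thetas, curve):
    """Map an observed R_0 to a Sharpe ratio by linear interpolation."""
    sign = 1.0 if r0 >= 0 else -1.0
    xs = [0.0] + list(curve)   # anchor (0, 0): no record imbalance, no drift
    ys = [0.0] + list(thetas)
    x = abs(r0)
    if x >= xs[-1]:
        return sign * ys[-1]   # clamp beyond the calibrated range
    j = bisect_left(xs, x)     # first index with xs[j] >= x
    if j == 0:
        return 0.0
    weight = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return sign * (ys[j - 1] + weight * (ys[j] - ys[j - 1]))
```

Negative values of $R_0$ are handled by symmetry, since exchanging the roles of upper and lower records is equivalent to changing the sign of the drift. The interpolation requires the calibrated curve to be increasing, which holds on average but may fail for too few simulated series.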


3.1 Efficiency 

Moment-based estimators have a hard time with heavy-tailed data. It is thus clear that their precision, i.e., efficiency, suffers from heavy tails. The new estimator, on the other hand, is likely to be less affected by the latter.

In order to compare their respective efficiency, let us denote by $\theta_0$ the Sharpe ratio inferred from $R_0$. The standard deviation of $\theta_0$, denoted by $\sigma_\theta$, is obtained by the delta method, i.e., from the relationship $\sigma_\theta = \sigma_R \left|\partial E(R_0 \mid n, \theta)/\partial\theta\right|^{-1}$, where $\sigma_R$ is the standard deviation of $R_0$; the derivative of $E(R_0 \mid n, \theta)$ was computed numerically with splines. The relative efficiency of $\theta_0$ with respect to the straightforward estimator $\theta_S = \hat{\mu}/\hat{\sigma}$ is then defined as $\rho = \sigma_\theta^2/\sigma_S^2$, where $\sigma_S$ is the standard deviation of $\theta_S$. The left plot of Fig. 4 reports the relative efficiency of $\theta_0$ for various $n$ for Student's t-distributed returns and $\nu = 4$. The new estimator is unambiguously more powerful than the vanilla estimator. This result holds as long as the returns are heavy tailed.

Financial price returns are not heavy tailed all the time. Thus it is important to check the efficiency of record statistics for log prices with Gaussian increments. Since the vanilla estimator is asymptotically optimal in this case (Neyman and Pearson 1933), any other estimator is bound to be less efficient for large $n$. The right-hand side plot of Fig. 4 reports the relative efficiency of $\theta_0$ for Gaussian increments, which depends on $c/\sigma$. Remarkably, $\theta_0$ may be slightly more efficient than the t-statistics itself for small $n$.

Figure 5. True Sharpe ratio versus Student tail index $\nu$ for $R_0 = 40$ and $50$ (circles and lozenges, respectively) for time series of length 252. The continuous lines are least-squares regressions to Eq. (8). The horizontal dashed line reports the value of the true Sharpe ratio for Gaussian increments ($\nu \to \infty$) and $R_0 = 40$, while the vertical one stands at $\nu = 5.2$, which is the threshold at which the respective rank switches if a time series with $R_0 = 40$ and $\nu = \infty$ is compared with one with $R_0 = 50$ and with exponent $\nu$.
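The delta-method step of the efficiency comparison amounts to dividing the standard deviation of $R_0$ by the slope of the calibrated curve $E(R_0 \mid n, \theta)$. The sketch below assumes a central finite difference on a calibration grid instead of the spline derivative, and takes the efficiency measure to be the variance of the record-based estimate divided by that of the vanilla one; function names and the numbers in the usage note are hypothetical.

```python
def central_slope(thetas, curve, i):
    """Finite-difference estimate of dE(R_0)/dtheta at the interior grid point i."""
    return (curve[i + 1] - curve[i - 1]) / (thetas[i + 1] - thetas[i - 1])


def delta_method_std(std_r0, slope):
    """Standard deviation of theta_0 implied by that of R_0 (first-order delta method)."""
    return std_r0 / abs(slope)


def variance_ratio(std_theta0, std_theta_s):
    """Variance of the record-based estimate divided by that of the vanilla one."""
    return (std_theta0 / std_theta_s) ** 2
```

For instance, with a calibration curve going through 10, 19 and 26 at $\theta = 0.1, 0.2, 0.3$, the slope at $\theta = 0.2$ is about 80; a standard deviation of 8 for $R_0$ then translates into a standard deviation of about 0.1 for $\theta_0$, and into a variance ratio of about 0.64 if the vanilla estimator has a standard deviation of 0.125.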


3.2 Dependence on Student tail exponent 

Although only the $\nu = 3$ case was studied analytically above, as it leads to workable expressions, the relationship between $E(R_0)$ and the Sharpe ratio of increments with a Student's t-distribution depends on $\nu$. As a consequence, at fixed $n$ and $R_0$, the estimated Sharpe ratio also depends on $\nu$. Extensive numerical simulations (see Fig. 5) with $N_\mathrm{avg}$ = 10^ show that, at fixed $R_0$ and $n$,

$$E_\nu(\theta) = a(R_0, n) - b(R_0, n)\,\nu^{-3/2}, \qquad (8)$$

where $a(R_0, n) = E_\infty(\theta)$ corresponds by definition to the average (and unbiased) Sharpe ratio of a process with Gaussian increments ($\nu \to \infty$).

In addition to providing a simple way to extend the inference of the Sharpe ratio to arbitrarily large values of $\nu$ from a finite interval of $\nu$, this equation quantifies the bias of an estimation of Sharpe ratios if one neglects the effect of non-Gaussian returns. Indeed, the estimation of $R_0$ does not require any assumption about the underlying increment distribution, only the connection to the Sharpe ratio does. This means that estimating Sharpe ratios requires estimating $\nu$, and that assuming $\nu = \infty$, as the vanilla method does, overestimates the true value of the Sharpe ratio as soon as the price return distribution has fatter tails than a Gaussian one. As a consequence, results about the equivalence of ranking from various methods only hold if all the assets have the same tail exponent. This point is further discussed in Section 4.1.
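Since the relationship $E_\nu(\theta) = a - b\,\nu^{-3/2}$ is linear in $x = \nu^{-3/2}$, the coefficients $a$ and $b$ at fixed $R_0$ and $n$ can be recovered by ordinary least squares. The sketch below makes this explicit; it is an illustration with a hypothetical function name, and it replaces whatever fitting routine was actually used by a closed-form linear regression.

```python
def fit_tail_dependence(nus, sharpes):
    """Least-squares fit of sharpe = a - b * nu**(-3/2); returns (a, b)."""
    xs = [nu ** -1.5 for nu in nus]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(sharpes) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, sharpes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope  # a is the Gaussian limit, b the tail correction
```

The intercept $a$ is the value extrapolated to $\nu \to \infty$, i.e. the Sharpe ratio that would be inferred under Gaussian increments, and $b\,\nu^{-3/2}$ is the amount by which this value overestimates the Sharpe ratio at tail index $\nu$.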


3.3 Simplified estimation 

Equation (8) provides a great complexity reduction in the estimation of the Sharpe ratio, but estimation may be further simplified by studying the dependence of both $a$ and $b$ on $R_0$ and $n$. We chose values of $R_0$ from 1 to $n$, and inferred the corresponding values of $\theta$ thanks to the calibrated relationships of Eq. (7). Then, at fixed $n$, we chose a value of $R_0$ and performed the non-linear fit of Eq. (8), which yields $a(R_0, n)$ and $b(R_0, n)$. We then filtered out fits whose p-value associated with $b$ is larger than 0.01 and whose average square residual is larger than 0.1, which only happens in regions with large $n$ and small $R_0$ (in other words, where $\theta = 0$ is a fairly good approximation of the true Sharpe ratio). This leaves 13282 values of $a$ and $b$, one for each remaining couple $(R_0, n)$.

Let us start with $a = E_\infty(\theta)$. The left plot of Fig. 6 shows that $a(R_0, n) = a(R_0/n)$ for $n \geq 100$: the collapse, while not perfect, is remarkable (there are 12645 points in this figure). In other words, $R_0 \simeq \gamma n$ with fixed $\gamma$ at least for $n \geq 100$ and $\theta \geq 0.001$ (a t-statistics of 0.01), as in the large $\theta n$ limit, although $n\theta = 0.1$ in this case is far from being large. Figure 6 also makes it clear that $1 - R_0/n$ decreases faster than an exponential, which makes sense since it asymptotically follows a Gaussian function (cf. Appendix B). Note that the scaling $R_0 \propto n$ assumes that price returns do have a trend. In other words, the Sharpe ratio in the region where $R_0 \propto \sqrt{n}$ will be under-estimated, but this region corresponds to negligible trends.

Let us now turn to $b(R_0, n)$. It turns out that there is a linear relationship $b \simeq \frac{8}{3}\, a$ in the region $a < 1$ (see the right plot of Fig. 6), the collapse being remarkable. This region is relevant to finance: for example, if $a = 1$ and $n = 100$, the t-statistics would be 10, a rarity. Thus, the whole calibration may rest on the determination of $a(R_0/n)$, since

$$\theta \simeq a(R_0/n)\left[1 - \frac{8}{3}\,\nu^{-3/2}\right]. \qquad (9)$$

As $a$ is a smooth function, we first round $R_0/n$ to a precision of 0.01, and compute the average of $a(r)$ where $r$ is the rounded value of $R_0/n$. Finally, a spline is calibrated on this coarse-grained relationship, with the additional point (0, 0) for the sake of convergence for very small values of $R_0/n$. Thus, simple scaling arguments made it possible to build a numerical estimation method for any $n$, $\nu$ and $R_0$ that rests on a single function.
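The simplified estimator combines Eq. (8) with the linear relationship $b \simeq \frac{8}{3} a$, which gives $\theta \simeq a\,[1 - \frac{8}{3}\nu^{-3/2}]$. The sketch below applies this correction and computes the tail index at which two assets exchange ranks. The function names are hypothetical, and so is the ratio 0.775 between the two Gaussian-limit values used in the usage note: it is not taken from the calibration, but was picked because it yields thresholds close to the values 5.2 and 4.4 quoted in the text for $R_0 = 40$ versus $R_0 = 50$.

```python
from math import inf

SLOPE = 8.0 / 3.0  # b ~ (8/3) a, reported for a < 1


def sharpe_from_a(a, nu):
    """Tail-corrected Sharpe ratio from its Gaussian-limit value a."""
    return a * (1.0 - SLOPE * nu ** -1.5)


def rank_switch_nu(a1, nu1, a2):
    """Tail index of asset 2 below which asset 1 (with a1 < a2) ranks higher."""
    target = sharpe_from_a(a1, nu1)
    x = (1.0 - target / a2) / SLOPE  # value of nu2**(-3/2) at equality
    return x ** (-2.0 / 3.0)
```

With `a1 = 0.775` and `a2 = 1.0`, `rank_switch_nu(0.775, inf, 1.0)` is close to 5.2 and `rank_switch_nu(0.775, 10.0, 1.0)` is close to 4.4. Only the ratio `a1 / a2` matters for these thresholds.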
Figure 6. Left plot: collapse plot showing the scaling relationship between $a = E_\infty(\theta)$ and $R_0/n$ for 12465 couples $(R_0, n)$, with $100 \leq n \leq 504$; the y axis is in log scale. Right plot: $b$ as a function of $a$, and a linear fit $b \simeq 2.67a$ for $a < 1$; same parameters as in the left plot.


4. Application to real data 


The i.i.d. assumption is totally unrealistic regarding asset price returns, if only because of volatility heteroskedasticity. Applying the above estimator straightforwardly would therefore make little sense on long time series. The approach followed here is to consider smaller time windows and to assume that stationarity approximately holds in each time window. The second current limitation of the proposed estimator to keep in mind here is that it does not account explicitly for skewness. At any rate, this section is meant to provide a clear illustration of how different the estimates of both methods may be.

In order to find the corresponding Sharpe ratio, we assume that price returns are conditionally leptokurtic (Bollerslev 1987): in each time window, we fit the returns with Student's t-distribution by maximum likelihood, obtain an estimate $\hat{\nu}$, and use Eq. (8).

Figure 7 shows the difference between annualized Sharpe ratios of SPY estimated with the new and vanilla estimators. When $\nu$ is larger than 10, both estimators yield almost the same Sharpe ratio, as expected from Eq. (8). On the other hand, when tails are heavier, i.e. when $\nu < 10$, the two estimates significantly differ. Indeed, the $\nu$-dependent term of Eq. (8) implies that vanilla estimates are too large in absolute value. This is confirmed in Fig. 8. The difference between both estimates is very large in leptokurtic times, e.g. in 2008 and 2009; in addition, in these difficult times, the new estimator is clearly less volatile, which is in line with its better efficiency.

Figure 7. Left plot: parametric fit of the number of degrees of freedom of Student's t-distribution in a sliding window of 252 close-to-close returns of SPY. Right plot: estimated Sharpe ratios with the new estimator and from a vanilla estimation. 1000 permutations have been used to estimate $R_0$.

Figure 8. Discrepancy of the estimates of the annualized Sharpe ratio of SPY with moving time windows of 252 days between the new and the vanilla estimators. 1000 permutations have been used to estimate $R_0$.

As a side note, the fact that the moment-based method overestimates the Sharpe ratio (and the t-statistic) in leptokurtic times means that using it for trading purposes leads to taking wrong trading decisions more often (the power of the r-statistics is indeed much larger than that of t-statistics for heavy-tailed data (Challet 2015)). Let us try the following naive trading strategy (without transaction costs): whenever the estimated annualized Sharpe ratio is larger than 1 in absolute value in the last 100 close-to-close price returns, one takes a long or short single-day position, depending on the sign of the Sharpe ratio (with a one day lag). We use an unbiased historical data set of US equities (1995-01-01 to 2015-06-30) and focus on liquid assets, i.e. those whose price is larger than \$20 and whose 60-day rolling median daily volume is larger than 250000 shares. Figure 9 reports the cumulated performance of this strategy when applied to all 3449 US equities over this period. The difference of performance between the two methods is marked in times of large fluctuations (e.g. 2008). Note that the y-axis of this plot is logarithmic so as to avoid fooling the reader (McLean 2011); in addition, it should be noted that the out-of-sample performance plotted in Fig. 9 is the result of 3449 decisions at each time step, i.e., very many decisions. As a consequence, the origin of the difference of performance between the new and vanilla methods is the relative power of the related statistical tests (Challet 2015), not an erroneous way of computing compounded returns.
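The decision rule of this naive strategy can be sketched for a single asset as follows. The code is illustrative only: the function names are hypothetical, the annualization by $\sqrt{252}$ is an assumption, the one-day lag is interpreted as deciding the position of day $t$ from the 100 returns ending on day $t - 1$, and the default estimator is the vanilla moment-based one (a record-based estimator can be passed through the `estimator` argument).

```python
from math import sqrt
from statistics import mean, stdev


def vanilla_annualized_sharpe(window, periods_per_year=252):
    """Moment-based Sharpe ratio of a window of daily returns, annualized."""
    return mean(window) / stdev(window) * sqrt(periods_per_year)


def naive_positions(returns, estimator=vanilla_annualized_sharpe,
                    window=100, threshold=1.0):
    """Position (+1, -1 or 0) held on each day, decided from the previous returns."""
    positions = [0] * len(returns)
    for t in range(window, len(returns)):
        estimate = estimator(returns[t - window:t])  # data up to day t - 1 only
        if abs(estimate) > threshold:
            positions[t] = 1 if estimate > 0 else -1
    return positions


def cumulated_log_performance(returns, positions):
    """Running sum of position times return, without transaction costs."""
    total, path = 0.0, []
    for r, p in zip(returns, positions):
        total += p * r
        path.append(total)
    return path
```

No position is taken during the first 100 days, for which no estimate is available.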
Figure 9. Cumulated performance of a trading strategy consisting in investing when the estimated annualized Sharpe ratio is larger than 1 in absolute value; short positions are allowed in case of negative Sharpe ratio; estimates over rolling windows of 100 trading days (close-to-close price returns). Unbiased historical database of 3449 US liquid equities. 1000 permutations have been used to estimate $R_0$.


4.1 Ranking assets 


It is worth discussing the relevance and practical value of the proposed Sharpe ratio estimator beyond its much improved precision in leptokurtic times and the appreciable fact that it is unbiased. The industry is not only interested in the actual Sharpe ratio values of a group of assets (stocks, hedge funds, etc.), but also in ranking them. Thus, an important question is to assess whether the new method ranks assets in a different way than vanilla Sharpe ratio estimation. If this is the case, the new method brings a valuable alternative way to rank assets.

Two preliminary remarks are in order. First, the fact that both methods estimate the same quantity, but that one method is clearly much more efficient for non-Gaussian variables, implies that the correlation of the asset ranks cannot be 1 because of the greater fluctuations of the vanilla estimator. Second, as explained above, the Sharpe ratio corresponding to an estimated $R_0$ depends on the tail index of the Student distribution. Since the new method is unbiased and the vanilla one is biased for heavy-tailed distributions, and since the estimated tail exponents at any given time will vary from asset to asset, one cannot expect the two methods to yield on average equivalent rankings, even asymptotically. In other words, the location-shape argument of Schuhmacher and Eling (2011) does not hold for assets with heterogeneous tail indices, as noted e.g. in Zakamulin (2010).

Let me take an example; looking once again at Fig. 5 makes it clear that the ranking of $R_0$, i.e., of the vanilla estimation method, may not be the same as the ranking according to the new method. Assume that asset 1 has $R_{0,1} = 40$ and asset 2 $R_{0,2} = 50$; since it neglects the possibility that $\nu_1 < \infty$ and $\nu_2 < \infty$, the vanilla method attributes a better rank to asset 2. Now, say $\nu_1 = 10$; as soon as $\nu_2 < 4.4$, asset 1 must be attributed a better rank than asset 2 (provided that the estimations of the tail exponents are precise enough).

Figure 10. Left: evolution of the fraction of common assets between rankings for the moment-free and vanilla Sharpe ratio estimation methods; black lines: top 5% positive ratios, red lines: smallest 5% ratios; calibration over 100 days; unbiased historical database of 3449 liquid US equities. Right: effective measure of average (red lines) and standard deviation (black lines) of tail exponents as a function of time for the 3449 US equities.

All the above shows the crucial role of $\nu$ and sheds new light on the debate 
on whether all risk measures are asymptotically equivalent. The point is 
that many of them may be biased in an equivalent way for non-Gaussian variables. 
For example, Auer and Schuhmacher (2013b) find that the rankings of hedge fund 
performance with the largest Sharpe ratios are the most stable. But their results rest on 
measures that overestimate Sharpe ratios for heavy-tailed returns; thus, a sizeable 
fraction of these large Sharpe ratios may simply result from the methods' 
biases. 


Let me first focus on the 5% best and worst estimated Sharpe ratios. Figure 10 
(left plot) shows that the fraction of common assets in these centiles is significantly 
different from 1, except notably for positive Sharpe ratios in 2008. As expected, this 
fraction decreases when the heterogeneity of tail exponents increases (right plot). 
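
The fraction of common assets can be computed along the following lines. This is a minimal sketch on synthetic data, not the code used for Fig. 10: the function names `top_fraction` and `common_fraction` are hypothetical, and the two noisy score dictionaries merely stand in for the vanilla and moment-free estimates.

```python
import random

def top_fraction(scores, fraction=0.05):
    """Set of asset ids whose score lies in the top `fraction` of `scores`."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

def common_fraction(scores_a, scores_b, fraction=0.05):
    """Fraction of assets shared by the top `fraction` of two rankings."""
    top_a = top_fraction(scores_a, fraction)
    top_b = top_fraction(scores_b, fraction)
    return len(top_a & top_b) / len(top_a)

# Synthetic illustration only: two noisy estimates of the same hidden quantity.
rng = random.Random(0)
hidden = {f"asset{i}": rng.gauss(0.0, 1.0) for i in range(1000)}
noisy_estimate = {a: s + rng.gauss(0.0, 1.0) for a, s in hidden.items()}
precise_estimate = {a: s + rng.gauss(0.0, 0.3) for a, s in hidden.items()}
print(common_fraction(noisy_estimate, precise_estimate))
```

The worst 5% can be obtained with the same functions by flipping the sign of every score.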

As expected, rankings differ more for assets with a small $\nu$, i.e., with heavy tails. 
The left plot of Fig. 11 reports the time evolution of the Spearman and Kendall rank 
correlations of all the assets for both methods. As in Auer and Schuhmacher (2013b), 
an alternative rank correlation measure (Blest 2000; Genest and Plante 2003), more 
sensitive to the rank of the largest values of data, is displayed in the right-hand side 
of this figure; it is slightly above zero on average, with large fluctuations in 2003 
and 2009, which echoes the decrease of both Spearman and Kendall correlations at 
those dates, but not at other dates. Thus, ranking equities with usual moment-based 
methods or the new moment-free method may yield very significantly different 
results in the case of daily price returns of equities. Hedge fund performance returns 
have a monthly resolution and are thus much closer to Gaussian variables, owing to 
the central limit theorem (see e.g. Bouchaud and Potters (2000)), which may also 
explain why previous studies did not find striking differences of ranking for most 
performance measures. 
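
For reference, the two standard rank correlations can be written in a few lines. The sketch below uses the textbook definitions for data without ties (a reasonable assumption for continuous Sharpe ratio estimates); the function names are hypothetical and the weighted Blest measure is not covered.

```python
def ranks(values):
    """Ranks from 1 to n; ties are assumed absent (continuous estimates)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation, 1 - 6 sum(d^2) / (n (n^2 - 1)), no ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def kendall(x, y):
    """Kendall tau: (concordant - discordant pairs) / total pairs, no ties."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return 2.0 * score / (n * (n - 1))

# Hypothetical Sharpe ratio estimates of four assets from two methods.
vanilla = [0.8, 1.5, 0.2, 1.1]
moment_free = [0.9, 1.0, 0.1, 1.3]
print(spearman(vanilla, moment_free), kendall(vanilla, moment_free))
```

The quadratic loop in `kendall` is fine for a sketch; for several thousand assets and rolling windows a library implementation would be preferable.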


February 9, 2017 


Applied Mathematical Finance pricerecords'student’amf 



Figure 11. Spearman, Kendall (left plot) and Genest-Plante-Blest (right plot) rank correlations between 
the ranks of the Sharpe ratios obtained by record statistics and the usual method as a function of time 
(black lines: positive Sharpe ratios, red lines: negative Sharpe ratios). Sliding calibration windows of 100 
days. 


Discussion 


The proposed Sharpe ratio estimator is robust, efficient, and well-behaved as it 
does not rely on moment estimation. Large returns are not regarded as outliers, but 
contribute to record statistics in a smooth way. In addition, a real outlier (due e.g. 
to a data error, or a neglected corporate action) may only create a single spurious 
additional price record, while two outliers of the same magnitude and opposite signs 
have only a mild influence on $R_0$. Finally, the robustness of the estimator lies in 
the fact that the latter is based only on the duration of drawdowns, not on their 
amplitudes. This is to be contrasted with other quantities related to drawdowns. 
For example, the expectation of the maximum drawdown of a Brownian motion is a 
known function of the Sharpe ratio (Magdon-Ismail et al. 2003), but is very sensitive 
to outliers by definition. 
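
The insensitivity to amplitudes can be illustrated with a small sketch. The conventions below are assumptions made for the example, not the definitions of the paper: the cumulated return starts at 0, an upper record is a strict new running maximum, and the total drawdown duration is the number of steps that do not set a record. All function names are hypothetical.

```python
import random
import statistics

def count_upper_records(returns):
    """Number of steps at which the cumulated return sets a new running maximum."""
    level, running_max, records = 0.0, 0.0, 0
    for r in returns:
        level += r
        if level > running_max:
            running_max = level
            records += 1
    return records

def total_drawdown_duration(returns):
    """Number of steps that do not set a new running maximum."""
    return len(returns) - count_upper_records(returns)

def vanilla_sharpe(returns):
    """Moment-based ratio: sample mean over sample standard deviation."""
    return statistics.mean(returns) / statistics.stdev(returns)

# Same synthetic path with one spurious return of two different sizes.
rng = random.Random(1)
base = [rng.gauss(0.0005, 0.01) for _ in range(250)]
for size in (10.0, 100.0):
    spiked = base[:100] + [size] + base[101:]
    print(size, count_upper_records(spiked), vanilla_sharpe(spiked))
```

Since both spikes are much larger than the range of the path, the ordering of the cumulated returns, and hence the record count, should be the same in both cases, whereas the moment-based ratio is dominated by the spike.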

Because of the lack of exact results, using this estimator requires for the time 
being numerical calibration, which has been much simplified by scaling arguments. 
Estimating Sharpe ratios with price record/drawdown statistics is not limited to 
Student's t-distributed returns, as indeed one may calibrate their relationships for 
any return distribution with finite variance. In addition, the method introduced in 
this paper provides a generic way to build many types of estimators with record 
statistics as long as the relationship between price record statistics and the 
measurable to estimate is monotonic. For example, it may be used to estimate the drift of 
a Lévy process. 

The main limitations of the proposed estimator are the assumptions of i.i.d. and 
symmetric increments. Both can be accounted for numerically for the time being. 
An interesting challenge is to incorporate serial correlations into the analytical 
computation of record statistics: numerical results point to simple corrections in 
the case of AR(1) and GARCH(1,1) models (Wergen 2014). Practically, a way to 
respect return auto-correlation and volatility heteroskedasticity is to use a kind of 
block bootstrap, as in Ledoit and Wolf (2008). 
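
A possible resampling scheme of this kind is sketched below. It is a plain circular block bootstrap with a fixed block length, given as an illustration only: it is not the studentized procedure of Ledoit and Wolf (2008), and the function name, the block length and the synthetic returns are all assumptions.

```python
import random
import statistics

def circular_block_bootstrap(returns, block_length, rng):
    """Resample `returns` by concatenating randomly chosen contiguous blocks.

    Blocks wrap around the end of the sample (circular scheme), so that
    short-range serial dependence inside each block is preserved.
    """
    n = len(returns)
    sample = []
    while len(sample) < n:
        start = rng.randrange(n)
        sample.extend(returns[(start + k) % n] for k in range(block_length))
    return sample[:n]

# Bootstrap dispersion of an arbitrary statistic (here the mean return).
rng = random.Random(0)
returns = [rng.gauss(0.0005, 0.01) for _ in range(500)]
replicates = [
    statistics.mean(circular_block_bootstrap(returns, 20, rng))
    for _ in range(200)
]
print(statistics.stdev(replicates))
```

Any estimator that takes a list of returns, for instance one based on record counts, can be substituted for `statistics.mean` in the last step.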
An R package entitled sharpeRratio implements the new estimator and is available 
on CRAN at https://cran.r-project.org/web/packages/sharpeRratio/index.html; 
a Python version is available on PyPI at https://pypi.python.org/pypi/pysharperratio/0.1.10. 

An interactive webpage which reproduces the plots of Section 3 for any asset 
symbol and time period may be found at 

https://brillant.shinyapps.io/moment-free_Sharpe_ratio . 


Appendix A. Expected number of records in the vanishing Sharpe ratio 
limit for Student’s t-distributed price returns 

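In this limit, one may use a first-order expansion of the reciprocal cumulative function 

$$P(S_n > 0) = \frac{1}{2} + P^{(n)}(0)\,cn + O([cn]^2). \qquad \text{(A1)}$$

One therefore needs to compute $P^{(n)}(0)$. Since the increments are assumed to be 
independent, 

$$P^{(n)}(0) = \frac{1}{\pi}\int_0^{\infty} \phi_n(t)\,dt = \frac{1}{\pi}\int_0^{\infty} [\phi(t)]^n\,dt,$$

where $\phi_n$ is the characteristic function of $P^{(n)}$ and $\phi$ that of $P^{(1)}(r) = P(r)$. 
Equation (A1) requires the computation of $P^{(n)}(0)$, which is impossible for 
arbitrary $n$ and $\nu$. However, the $\nu = 3$ case leads to workable expressions. One finds 
$P^{(n)}(0) = \frac{e^n}{\sigma\pi} E_{-n}(n)$, where $E_n(z)$ is the exponential integral function. The specific 
form $E_{-n}(k)$ of the exponential integral function is easy to compute in a recursive 
way by integration by parts: 

$$E_{-n}(k) = \frac{e^{-k}}{k} + \frac{n}{k}\,E_{-(n-1)}(k), \qquad E_0(k) = \frac{e^{-k}}{k}.$$

Therefore, after some elementary computations, $E_{-n}(n) = e^{-n}\,\frac{n!}{n^{n+1}}\sum_{s=0}^{n}\frac{n^s}{s!}$ and 

$$P^{(n)}(0) = \frac{1}{\sigma\pi}\,\frac{n!}{n^{n+1}}\sum_{s=0}^{n}\frac{n^s}{s!}. \qquad \text{(A2)}$$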
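
The closed form of Eq. (A2), i.e. $P^{(n)}(0)$ equal to $n!/n^{n+1}$ times the partial exponential sum $\sum_{s \le n} n^s/s!$, divided by $\sigma\pi$, can be checked numerically. The sketch below assumes that the increments are Student's t-distributed with $\nu = 3$ and rescaled to variance $\sigma^2$, whose characteristic function is $(1 + \sigma|t|)e^{-\sigma|t|}$; this scaling is an assumption chosen to match the $1/(\sigma\pi)$ prefactor. Function names, the integration cut-off and the number of Simpson steps are illustrative choices.

```python
import math

def p0_closed_form(n, sigma=1.0):
    """Closed form: n!/n^(n+1) * sum_{s<=n} n^s/s!, divided by sigma*pi."""
    partial_sum = sum(n ** s / math.factorial(s) for s in range(n + 1))
    return math.factorial(n) / n ** (n + 1) * partial_sum / (sigma * math.pi)

def p0_numerical(n, sigma=1.0, steps=20000):
    """(1/pi) * integral over t >= 0 of phi(t)^n, by Simpson's rule."""
    upper = 60.0 / sigma  # the integrand is negligible beyond this point

    def phi_n(t):
        # Characteristic function of the rescaled nu = 3 Student law, to the n.
        return ((1.0 + sigma * t) * math.exp(-sigma * t)) ** n

    h = upper / steps  # steps must be even for Simpson's rule
    total = phi_n(0.0) + phi_n(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * phi_n(k * h)
    return total * h / 3.0 / math.pi

for n in (1, 2, 5, 20):
    print(n, p0_closed_form(n), p0_numerical(n))
```

For $n = 1$ both functions return the density at the origin of a unit-variance Student's t-distribution with $\nu = 3$, namely $2/\pi$.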


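Using the asymptotic expansion $K_n = \sum_{s=0}^{n}\frac{n^s}{s!} \simeq e^n\left[\frac{1}{2} + \sqrt{\frac{2}{3\pi n}}\right]$ and 
the usual Stirling expansion, Eq. (A2) becomes 

$$P^{(n)}(0) \simeq \frac{1}{\sigma\sqrt{2\pi n}} + \frac{2}{\sigma\pi\sqrt{3}\,n},$$

and thus, in the case of small drifts, Eq. (A1) reads 

$$P(S_n > 0) \simeq \frac{1}{2} + \frac{c}{\sigma}\sqrt{\frac{n}{2\pi}} + \frac{c}{\sigma}\,\frac{2}{\pi\sqrt{3}}.$$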
Higher-order expansions of $K_n$ and $n!$ contribute terms that are 
negligible. It is noteworthy that the additional correction for Student increments 
does not depend on $n$; accordingly, it is relevant for any value of $n$ and has a larger 
relative weight for smaller $n$; this is consistent with the fact that convolutions of 
Student's t-distributions with $\nu = 3$ converge to a Gaussian distribution. The Sparre 
Andersen theorem yields 

$$\tilde q_-(z) = \frac{1}{\sqrt{1-z}}\left[1 - \frac{c}{\sigma}\sum_{n=1}^{\infty}\frac{z^n}{\sqrt{2\pi n}} + \frac{c}{\sigma}\,\frac{2}{\pi\sqrt{3}}\log(1-z)\right] + O[(c/\sigma)^2]. \qquad \text{(A3)}$$


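The generating function of the number of records is then (Le Doussal and Wiese 2009) 

$$\tilde m_+(z) \simeq \frac{1}{(1-z)^{3/2}}\left[1 + \frac{c}{\sigma}\,\frac{1}{\sqrt{2\pi}}\sum_{n=1}^{\infty}\frac{z^n}{\sqrt{n}} - \frac{2c}{\sigma\pi\sqrt{3}}\log(1-z)\right].$$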


The first two terms in the brackets are the same as those of Gaussian biased 
random walks (Majumdar, Schehr, and Wergen 2012). The third term is new and 
due to the difference between a Gaussian and a t-distribution at the origin. Whereas 
the approximations are controlled up to this point, the current literature on this topic 
goes one step further: asymptotic results are (very) roughly obtained by 
approximating divergent partial sums. This captures the way the sum diverges, without 
much control over the precision of the prefactor. At any rate, as the prefactor is not 
essential to our purpose, we have followed the same route, which yields 

$$-\frac{\log(1-z)}{(1-z)^{3/2}} \simeq \frac{2}{\sqrt{\pi}}\sum_{n\ge 1} z^n\,2\sqrt{n}\left[\operatorname{atanh}\sqrt{1-\frac{1}{n}} - \sqrt{1-\frac{1}{n}}\right], \qquad \text{(A4)}$$


which is not a very good approximation even for large $n$ but gives the correct 
asymptotic $\sqrt{n}$ dependence, with an additional logarithmic correction brought by 
$\operatorname{atanh}\sqrt{1-1/n} - \sqrt{1-1/n}$. Finally, approximating $n$ by $n-1$ as in Wergen, Bogner, 
and Krug (2011) and identifying each term of the generating function with the value 
of $n$ one is interested in gives 

$$E(R_+)(c/\sigma, n) \simeq \frac{2\sqrt{n}}{\sqrt{\pi}} + \frac{c}{\sigma}\,\frac{\sqrt{2}}{\pi}\left[n\arctan(\sqrt{n}) - \sqrt{n}\right] + \frac{c}{\sigma}\,\frac{8}{\sqrt{3}\,\pi^{3/2}}\,\sqrt{n}\left[\operatorname{atanh}\sqrt{1-\frac{1}{n}} - \sqrt{1-\frac{1}{n}}\right]. \qquad \text{(A5)}$$

Given its derivation, this formula is relevant in the limit $cn \ll \sigma$ and large $n$, or 
equivalently $(c/\sigma)\sqrt{n} \ll 1/\sqrt{n}$, i.e., in the limit of vanishingly small Sharpe ratios. 
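
For convenience, the three terms of Eq. (A5) can be evaluated as follows. This is a sketch of the formula as I read it (symmetric term $2\sqrt{n}/\sqrt{\pi}$, Gaussian drift term in $n\arctan\sqrt{n} - \sqrt{n}$, Student correction in $\sqrt{n}\,[\operatorname{atanh}\sqrt{1-1/n} - \sqrt{1-1/n}]$); the function name and the argument `sharpe`, which stands for $c/\sigma$, are hypothetical.

```python
import math

def expected_records_small_drift(sharpe, n):
    """Small-drift approximation of the expected number of upper records.

    `sharpe` is c/sigma per time step; meant for sharpe * n << 1 and large n.
    """
    sqrt_n = math.sqrt(n)
    # Driftless (symmetric) contribution.
    symmetric = 2.0 * sqrt_n / math.sqrt(math.pi)
    # First-order drift correction shared with Gaussian increments.
    gaussian_drift = sharpe * math.sqrt(2.0) / math.pi * (
        n * math.atan(sqrt_n) - sqrt_n
    )
    # Additional correction specific to Student increments with nu = 3.
    x = math.sqrt(1.0 - 1.0 / n)
    student = sharpe * 8.0 / (math.sqrt(3.0) * math.pi ** 1.5) * sqrt_n * (
        math.atanh(x) - x
    )
    return symmetric + gaussian_drift + student

# Example inside the intended regime: sharpe * n = 0.1.
print(expected_records_small_drift(0.001, 100))
```

Since the prefactors of this approximation are not tightly controlled, the output is best used for orders of magnitude and scaling checks rather than as a calibration curve.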
Figure B1. Limiting $n_0$ as a function of $c/\sigma$, from Eq. (B1). Convoluted Student's t-distributions may be 
approximated by a Gaussian distribution below the continuous line, and by a power-law above this line. 


Appendix B. Expected number of price records in the large $cn/\sigma$ limit 


The $cn/\sigma \gg 1$, $n \gg 1$ limit also makes it possible to derive some analytical 
insights. Majumdar, Schehr, and Wergen (2012) give results for large, but not too 
large, $cn/\sigma$. Indeed, the central limit theorem states that the convergence of the 
distribution of convoluted variables to a Gaussian distribution occurs from the 
center of the distribution. This implies that the tails of any non-Gaussian distribution 
are non-Gaussian. Thus, intuitively, when $cn/\sigma$ is large enough (whose meaning will 
be discussed below), $P(x_n < cn)$ comes from the non-Gaussian tails. This will lead 
to markedly different results for Student's t-distributions since the tails of 
convoluted t-distributions keep their power-law nature. Bouchaud and Potters (2000) give 
an intuitive argument to compute the n-time convoluted return $r_0^{(n)}$ at which the 
Student and Gaussian parts of the distribution have equal importance and find that 
$r_0^{(n)} \simeq \sigma\sqrt{n\log n}$ for $\nu = 3$. This means that the value of $n_0$ at which the power-law 
tail starts to prevail is such that $cn_0 \simeq \sigma\sqrt{n_0\log n_0}$, i.e., 

$$\frac{c}{\sigma} \simeq \sqrt{\frac{\log n_0}{n_0}}. \qquad \text{(B1)}$$
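
The relation $c/\sigma \simeq \sqrt{\log n_0 / n_0}$ has no closed-form inverse, but $n_0$ is easily obtained numerically. The sketch below treats the relation as an equality and solves it by bisection on the decreasing branch $n_0 > e$; the function name and the tolerance are illustrative choices.

```python
import math

def crossover_n0(sharpe, tol=1e-10):
    """Solve sharpe = sqrt(log(n0) / n0) for n0 > e by bisection.

    `sharpe` stands for c/sigma. The right-hand side is maximal at n0 = e,
    where it equals exp(-1/2), and decreases beyond, so a unique solution
    exists for 0 < sharpe < exp(-1/2).
    """
    if not 0.0 < sharpe < math.exp(-0.5):
        raise ValueError("sharpe must lie strictly between 0 and exp(-1/2)")

    def gap(n):
        return math.sqrt(math.log(n) / n) - sharpe

    lo = hi = math.e
    while gap(hi) > 0.0:  # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for sharpe in (0.2, 0.1, 0.05):
    print(sharpe, crossover_n0(sharpe))
```

Smaller Sharpe ratios push $n_0$ to larger values, i.e. the Gaussian approximation of the convoluted distribution remains usable over longer horizons.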


Since the convoluted distribution has a continuous first derivative, there is no 
sharp transition between the Gaussian and power-law regimes, hence $n_0$ only 
approximately indicates where the Gaussian approximation begins to break down. 
Figure B1 plots $n_0(c/\sigma)$ and shows these two regions. In the region well below the 
line, a Gaussian approximation holds for Student convolutions. Conversely, when 
$n \gg n_0(c/\sigma)$, 

$$P(S_n < 0) \simeq \int_{cn}^{\infty}\frac{2\sigma^3 n}{\pi x^4}\,dx = \frac{2}{3\pi}\left(\frac{\sigma}{c}\right)^3\frac{1}{n^2},$$
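hence $\log \tilde q_-(z) = \left(\frac{\sigma}{c}\right)^3\frac{2}{3\pi}\sum_{n=1}^{\infty}\frac{z^n}{n^3}$, thus 

$$\tilde m_+(z) = \frac{1}{(1-z)^2}\exp\left[-\left(\frac{\sigma}{c}\right)^3\frac{2}{3\pi}\sum_{n=1}^{\infty}\frac{z^n}{n^3}\right].$$

Finally, one finds without major difficulty 

$$\tilde m_+(z) \simeq \frac{1}{(1-z)^2}\left[1 - \left(\frac{\sigma}{c}\right)^3\frac{2}{3\pi}\sum_{n=1}^{\infty}\frac{z^n}{n^3}\right]$$

and 

$$\frac{m_+(n)}{n} \simeq 1 - \left(\frac{\sigma}{c}\right)^3\frac{2K}{3\pi}, \qquad K = \sum_{k=1}^{n}\frac{1}{k^3}.$$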


Numerically, $K \simeq 1.202$ for large $n$; approximating sums with integrals yields 
the very different $K = 1/2$. Thus the number of records increases linearly for large 
$n$, $m_+(n) \simeq n\,\mu_{\text{Student}}$, with an asymptotic rate given by 
$\mu_{\text{Student}} = 1 - \frac{2K}{3\pi}\left(\frac{\sigma}{c}\right)^3$, to 
be compared with $\mu_{\text{Gauss}} = 1 - \frac{\sigma}{c\sqrt{2\pi}}\,e^{-c^2/(2\sigma^2)}$. Figure B2 plots the difference of the 
record rate between Gaussian- and Student's t-distributed ($\nu = 3$) increments as 
a function of $c/\sigma$. Whereas the number of records of random walks with Student 
increments is larger than that with Gaussian ones for small Sharpe ratios, Fig. 
B2 shows, somewhat surprisingly, that Gaussian increments lead to a larger rate of 
records for very large Sharpe ratios. 